home *** CD-ROM | disk | FTP | other *** search
- Hi.
-
- This is a short tutorial on the radix-sort. If you already know what a radix-sort
- is, how it works and you don't use quicksort anymore you can skip this
- tutorial.
-
- The tutorial is intended for programmers who have to write a fast sorting
- algorithm for small- and medium-sized lists. So if you want to sort a
- database you should search the web for another tutorial. However, if you
- write a 3d-engine or such stuff you should read on because radix-sort is very
- good for this purpose.
-
-
- THE BASICS
- ----------
-
- How do you sort a list of numbers? Basically you have an algorithm which
- compares numbers and tries to get the list more and more sorted.
-
- The Radix-sort is a little bit different. It doesn't compare anything, it
- sorts the numbers with several passes.
-
- Let's start with a list of bytes. A very simple radix-sort for bytes
- would work like this:
-
- source: List of bytes
- source_n: number of bytes to sort
- dest[256]: 256 lists of bytes. each list should have enough space to hold
- source_n elements.
-
- Here is pseudo-code to sort the list:
-
- for i = 0 to source_n do
- dest[source[i]].append (source[i])
- endfor
-
- This means you append all items with value #1 to list 1, all items with
- value #2 to list 2 and so on. As you can see this algorithm runs very fast
- (it only copies the bytes in a very tight loop). The disadvantage is that
- you need a lot of memory to sort the list becuase you don't know how many
- items of each type you have...
-
-
-
- SAVING MEMORY
- -------------
-
- Do you really not know how many items of each type you have? No, you can
- count them and optimize your memory requirements. We'll need another loop
- to count the distribution of the source values:
-
- int distribution[256]
-
- // fill the list with zeros.
-
- for i=0 to 255 do
- distribution[i]=0;
-
- // build a distribution history:
-
- for i=0 to source_n do
- distribution[source[i]] = distribution[source[i]] +1;
- endfor
-
- // Now we build an index-list for each possible element:
-
- int index[256];
- index [0]=0;
-
- for i=0 to 255 do
- index[i]=index[i-1]+distribution[i-1];
- endfor
-
-
- Adapting the original radix-sort to make use of the new information
- will result in this code:
-
- dest: array of bytes with space for source_n bytes.
-
- for i = 0 to source_n do
- dest[index[source[i]]]=source[i];
- index[source[i]] = index[source[i]] + 1;
- endfor
-
- After this dest contains the sorted bytes.
-
- Now we only need the same amount of memory as the original source list
- took up. This saves quite a lot of memory, and if your lists don't contain
- millions of values we can live with the memory requirements.
-
-
- EXTENDING THE BYTE-SORT
- -----------------------
-
- Who wants to sort bytes? You? I normally sort ascii text, floats or
- long ints. Because I only want to explain how the extension works, I'll
- go on with long ints.
-
- You can increase the size of the elements in the sort in several ways.
- You can either increase the size of index and distribution or you sort in
- several passes.
-
- Increasing the size of the index only works for short values (sorting
- short ints would require at least 65536*2 bytes for the index and
- distribution lists, which finally need 256 kb of memory... I think this is
- too much (just think about how long it takes to clear the memory and build
- the index-list.. In this time a quick sort may have sorted your list
- already).
-
- Now we need a clever way to sort more bits with the same memory amount.
-
- We can do this with several passes. Let's do it in decimal.
-
- Unsorted list:
-
- 523
- 153
- 088
- 554
- 235
-
- Sorted for Radix 0 (least significant digit):
-
- 523
- 153
- 554
- 235
- 088
- ^
-
- Sorted for Radix 1 (2nd. significant digit):
-
- 523
- 235
- 153
- 554
- 088
- ^
-
- Sorted for Radix 2 (most. significant digit):
-
- 088
- 153
- 235
- 523
- 554
- ^
-
- As you can se we preserve the old sorting order and advance by one digit
- until we've sorted all digits. The illustration did this in decimal. The
- native digit format for computers are bits, bytes, words and dwords.
-
- C Source
- --------
-
- I know... real programmers code in assembler. (Hey! I thought you want to
- understand the radix-sort, not assembly). The program compiles well under
- Watcom C++. However, it should at least work on all 32-bit C compilers and
- on some 16-bit compilers.
-
-
-
- ----------------------------- cut here ---------------------------
-
- #include <iostream.h>
- #include <stdlib.h>
- #include <string.h>
-
- void radix (int byte, long N, long *source, long *dest)
- {
- long count[256];
- long index[256];
- memset (count, 0, sizeof (count));
- for ( int i=0; i<N; i++ ) count[((source[i])>>(byte*8))&0xff]++;
-
- index[0]=0;
- for ( i=1; i<256; i++ ) index[i]=index[i-1]+count[i-1];
- for ( i=0; i<N; i++ ) dest[index[((source[i])>>(byte*8))&0xff]++] = source[i];
- }
-
- void radixsort (long *source, long *temp, long N)
- {
- radix (0, N, source, temp);
- radix (1, N, temp, source);
- radix (2, N, source, temp);
- radix (3, N, temp, source);
- }
-
- void make_random (long *data, long N)
- {
- for ( int i=0; i<N; i++ ) data[i]=rand()|(rand()<<16);
- }
-
-
- long data[100];
- long temp[100];
-
- void main (void)
- {
- make_random(data, 100);
- radixsort (data, temp, 100);
- for ( int i=0; i<100; i++ ) cout << data[i] << '\n';
- }
-
- ----------------------------- cut here ---------------------------
-
- LAST WORDS
- ----------
-
- I hope you learned something.
-
- I have to thank Climax from Amable for making me interested in this
- algorithm (well, two weeks before I thought quick sort is the best).
-
- Also thanks go out to Peter Flynn who took a look at this file and
- corrected all my type-erros. Thank you!
-
- If you want to contact me write to
-
- Nils Pipenbrinck / Submissive from Cubic Team & $eeN
- 100121.2450@compuserve.com (checked daily)
- subcode@tecs.de (checked at least once a week)
-